1. INTRODUCTION

The rapid expansion of the economy has created considerable challenges in fire management because of
the increased scale and intricacy of projects. Detecting fires early and accurately is crucial to minimizing
fire-related damage, so reliable fire detection and alarm systems with high sensitivity and precision are
essential. Traditional fire detection systems [1], [2], such as those that rely on heat and smoke detectors,
may prove inadequate in larger spaces, in complex buildings, or in environments with multiple sources of
interference. These limitations can lead to missed detections, false alarms, and delays in recognizing real
fires, all of which make it difficult to provide timely fire warnings. Video-based fire detection has recently
become a popular research topic because it offers several benefits, including early fire detection, high
accuracy, and the ability to identify fires in large areas and complex building systems [3].

Studies on fire detection based on video and image processing have appeared widely since the
development of cameras and artificial intelligence. To identify motion pixels in video, Toreyin et al. [4]
presented a Gaussian mixture background estimation approach. This approach uses a color model to identify
possible fire locations, then uses wavelet analysis in the spatial and temporal dimensions to assess
high-frequency activity in the area. In practice, however, this approach has high computational
complexity.

Han et al. [5] successfully detected motion in the lab using a multicolor model and a Gaussian mixture
model, but these methods cannot be used in real-world applications because they take a large amount of
processing time. Khan et al. [6] proposed a video-based approach that employs fire dynamics and static indoor
fire identification based on the color, area, roundness, and perimeter of the fire. A small fire, such as a
candle flame, is treated as a supplementary component of their technique. Because the technique eliminates
such fires and then relies on flame development features for analysis, it may have a significant weakness in
the early detection of fire.
Khalil et al. [7] introduced a novel fire detection approach based on the Commission Internationale de l’Eclairage
(CIE) L*a*b* and red, green, and blue (RGB) color spaces by combining motion detection with flame object
monitoring and calculating the rate of flame growth in the video. This method enhances fire detection accuracy and
produces decent results, but it has a high frequency of false positive alerts and is unstable in complex scenes.

Deep learning is currently a popular area of research due to its remarkable accuracy in recognizing
patterns across a diverse set of applications. Researchers have employed deep learning algorithms for fire
detection [8], [9] and achieved excellent accuracy. Deep learning could therefore address some of the issues
encountered in the fire detection process, but it has certain limitations. Its accuracy improves with large
volumes of training data, yet cameras capture few instances of real flames, so actual flame samples are
scarce. As an illustration, the flame dataset from Alves et al. [10] includes 800 images. In addition,
training a deep learning model demands powerful equipment and consumes a significant amount of time.

This research addresses the challenges that remain in video-based fire detection by proposing a
camera-based automatic fire detection approach. The proposed method is applicable to both enclosed and open
spaces and combines several domains to overcome the limitations of existing systems. It recognizes the
flame in the YCbCr and hue, saturation, value (HSV) color spaces and uses frame difference and Otsu’s
method. Additionally, a new preprocessing step based on the integer Haar lifting wavelet transform not only
decreases the size of the processed data but also produces more effective features.


2. METHOD 

A five-step approach is proposed for fire detection: 1) preprocess input data with a wavelet transform;
2) use Otsu’s technique to classify fire pixels; 3) detect fire motion with frame differences; 4) distinguish
fire from non-fire objects using a two-color-space model; and 5) compute the flame area. See Figure 1 for a
detailed explanation of each step. The video is first split into frames so that these steps can be applied.


2.1. Pre-processing (wavelet transforms) 

The integer Haar lifting wavelet transform (Int-to-Int-HLWT) is used in this study to reduce
processing time. The wavelet transform differs from the Fourier transform in that it represents a signal with
basis functions of finite duration rather than with infinitely extended sinusoids. It analyzes signals across
the time and frequency domains, giving good frequency resolution for long-duration, low-frequency components
and good time resolution for short-duration, high-frequency components [11].

In the Int-to-Int-HLWT technique, each frame is separated into four sub-bands: high-high (HH), low-high
(LH), high-low (HL), and low-low (LL), and only the low-frequency band (LL) is used for further processing.
The Haar filter, which is commonly used in conjunction with the discrete wavelet transform, is used to compute
the approximation and detail coefficients [12]. Keeping only the LL band reduces the amount of data to be
stored and processed by 75%, which shortens processing time while preserving the essential information.
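As an illustration of the idea, the following is a minimal sketch of one level of an integer Haar lifting transform, assuming the common predict/update form (detail = odd - even, approximation = even + floor(detail / 2)). The function names `haar_lift_1d`, `haar_unlift_1d`, and `ll_band` are hypothetical, the frame is a plain list of rows with even dimensions, and the exact lifting variant used in the study is not given in the text.

```python
def haar_lift_1d(values):
    """One level of integer Haar lifting on an even-length sequence of ints."""
    evens, odds = values[0::2], values[1::2]
    # Predict step: each odd sample is predicted by its even neighbour.
    detail = [o - e for e, o in zip(evens, odds)]
    # Update step: d >> 1 is floor(d / 2) for Python ints, so values stay integer.
    approx = [e + (d >> 1) for e, d in zip(evens, detail)]
    return approx, detail


def haar_unlift_1d(approx, detail):
    """Exact inverse of haar_lift_1d (shows that no information is lost)."""
    evens = [s - (d >> 1) for s, d in zip(approx, detail)]
    odds = [e + d for e, d in zip(evens, detail)]
    restored = []
    for e, o in zip(evens, odds):
        restored.extend((e, o))
    return restored


def ll_band(frame):
    """Return the LL sub-band of a 2-D integer frame (rows first, then columns)."""
    row_low = [haar_lift_1d(list(row))[0] for row in frame]
    columns = [list(col) for col in zip(*row_low)]
    col_low = [haar_lift_1d(col)[0] for col in columns]
    return [list(row) for row in zip(*col_low)]


frame = [
    [10, 12, 20, 22],
    [14, 16, 24, 26],
    [30, 30, 40, 40],
    [30, 30, 40, 40],
]
ll = ll_band(frame)  # 2 x 2 result: one quarter of the original 16 samples
```

The LL band has half the rows and half the columns of the input, which is where the 75% reduction in data comes from.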


2.2. Otsu’s algorithm 

Otsu’s threshold selection method, proposed by Nobuyuki Otsu in 1979 [13], is a simple and effective
technique for processing grayscale frames. Figure 1 illustrates the classic Otsu algorithm for establishing
a threshold value. After successful segmentation of the fire frame, the color distribution is restricted to black
(0) and white (1). The flame is denoted by white (1) and the background by black (0). To improve the results, a
morphological approach was used to remove small pixel groups that were unrelated to the fire [14].
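The threshold selection can be sketched as follows. This is a minimal illustration of the classic Otsu criterion (maximizing the between-class variance of the gray-level histogram), not the study's MATLAB implementation; the function names and the 8-bit level count are assumptions, and the morphological clean-up step is not included.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance.

    pixels: flat iterable of integer gray levels in [0, levels - 1].
    Pixels with value > t are treated as foreground.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_score = 0, -1.0
    weight_bg, sum_bg = 0, 0
    for t in range(levels):
        weight_bg += hist[t]
        sum_bg += t * hist[t]
        if weight_bg == 0:
            continue  # no background pixels yet
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break  # every pixel is background from here on
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        # Proportional to the between-class variance.
        score = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if score > best_score:
            best_score, best_t = score, t
    return best_t


def binarize(pixels, threshold):
    """White (1) for candidate flame pixels, black (0) for background."""
    return [1 if p > threshold else 0 for p in pixels]


gray = [10] * 50 + [200] * 50  # dark background and a bright region
t = otsu_threshold(gray)
mask = binarize(gray, t)
```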


2.3. Frame difference method for motion detection 

The flame’s form is uneven and varies frequently due to the dynamic properties of fire. When motion is
used as a prominent characteristic for identifying fire, common detection methods include consecutive
frame differencing [15], mixed Gaussian background modeling [16], and background subtraction [17].
Background subtraction requires an appropriate background model, which is hard to establish because of the
significant difference between day and night: a constant background is rarely available, and the parameters
of a changing background are more intricate to define than those of a static one. The mixed Gaussian model
requires preprocessing to determine the history frame, the number of Gaussian mixtures, the background
update rate, and the noise, which makes it excessively complicated.

The frame difference method is easy to use, requires little programming, is insensitive to scene changes
such as lighting, and adapts quickly to changing circumstances. However, the difference between two
consecutive frames is often too small to reveal flame motion. Therefore, this research uses an enhanced
frame difference approach that exploits the continual shifts in flame pixels caused by airflow and
combustion properties [18]. The enhanced method converts the video stream into frame images, applies
grayscale processing to combine the RGB channels, and subtracts frames that are eight frames apart, the
interval at which the flame pixels change the most.
$I_{d(k,k+8)} = |I_{(k+8)} - I_k|$ (1)


Here, $I_k$ is the value of the $k$-th frame of the video and $I_{(k+8)}$ is the value of the $(k+8)$-th frame.
The motion detection frame must be binarized before proceeding to the color detection step, and
morphological operations are again used to remove small white pixel groups [14].
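A minimal sketch of the eight-frame difference in (1), followed by binarization, is shown below. The function name `frame_difference` and the fixed binarization threshold of 25 gray levels are assumptions made for illustration; the text does not state how the motion frame is binarized, and the morphological clean-up is omitted.

```python
def frame_difference(frames, k, gap=8, threshold=25):
    """Binary motion mask from |I_(k+gap) - I_k|.

    frames: list of grayscale frames, each a list of rows of ints.
    threshold: assumed binarization level (not specified in the text).
    """
    earlier, later = frames[k], frames[k + gap]
    return [
        [1 if abs(b - a) > threshold else 0 for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(earlier, later)
    ]


# Nine tiny 2 x 2 frames: only one pixel changes between frame 0 and frame 8.
static = [[10, 10], [10, 10]]
frames = [static] * 8 + [[[10, 200], [10, 10]]]
motion = frame_difference(frames, k=0)
```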
Figure 1. The suggested flame detection system


2.4. The two-color detection 

The color of a flame is frequently identified as its most striking attribute and is widely used to
distinguish fire from other objects. As a result, the fourth step of the suggested technique is color
detection, which combines the YCbCr and HSV color spaces to identify potential fire zones.


2.4.1. HSV color space 

The image is numerically represented as an m × n × 3 array with values in the range [0, 1]. The third
dimension of the HSV array holds the hue, saturation, and value of each pixel. The hue is a value ranging
from 0 to 1 that denotes the location of a particular color on a color wheel. As it increases from 0 to 1, the
hue progresses through a spectrum of colors, starting with red and moving on to orange, yellow, green, cyan,
blue, and magenta before returning to red. Saturation, on the other hand, relates to the intensity of the color,
or its degree of deviation from neutrality. A value of zero represents a neutral shade, while a value of one
represents the highest level of saturation. The value is the maximum of the pixel’s red, green, and blue
components. The HSV color can be produced using the non-linear RGB transformation (2) to (4) [19].


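$$h = \begin{cases} \theta, & \text{if } b \le g \\ 360 - \theta, & \text{if } b > g \end{cases} \quad \text{where } \theta = \cos^{-1}\left\{ \frac{\frac{1}{2}\left[(r-g)+(r-b)\right]}{\left[(r-g)^2+(r-b)(g-b)\right]^{1/2}} \right\} \quad (2)$$
$$v = \max(r, g, b) \quad (3)$$
$$s = \frac{v - \min(r, g, b)}{v} \quad (4)$$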
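For readers who want to experiment, the transformation can be sketched in code. The sketch assumes the widely used arccos form of the hue (theta computed from r, g, and b, and replaced by 360 - theta when b > g), v = max(r, g, b), and s = (v - min(r, g, b)) / v. The function name `rgb_to_hsv_arccos` and the handling of gray and black pixels are assumptions, and the hue is returned in degrees (divide by 360 for a 0 to 1 scale).

```python
import math


def rgb_to_hsv_arccos(r, g, b):
    """Convert r, g, b in [0, 1] to (hue in degrees, saturation, value)."""
    v = max(r, g, b)
    # Assumed convention: black pixels get zero saturation.
    s = 0.0 if v == 0 else (v - min(r, g, b)) / v

    numerator = 0.5 * ((r - g) + (r - b))
    # The radicand is non-negative in exact arithmetic; guard against rounding.
    radicand = max(0.0, (r - g) ** 2 + (r - b) * (g - b))
    denominator = math.sqrt(radicand)
    if denominator == 0:
        theta = 0.0  # gray pixel: hue is undefined, 0 by assumed convention
    else:
        ratio = max(-1.0, min(1.0, numerator / denominator))
        theta = math.degrees(math.acos(ratio))

    h = theta if b <= g else 360.0 - theta
    return h, s, v


h, s, v = rgb_to_hsv_arccos(0.9, 0.5, 0.1)  # an orange, flame-like color
```

For this orange pixel the hue is about 30 degrees, which corresponds to roughly 0.083 on the 0 to 1 scale used in the text.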

Given the range of colors that fire exhibits, including yellow, red, and white at higher temperatures, 
we have chosen to use the HSV color system in this particular scenario. After conducting several tests, we 
determined an optimal threshold for segmenting flame colors. The following equation provides a clear 
representation of this threshold: 


$0 < H < 0.2, \quad 0.47 < S < 0.98, \quad 0.7 < V < 0.98$ (5)


where V, S, and H denote the value, saturation, and hue components of a frame. These thresholds divide the
frame into two parts: the foreground denotes fire colors, while the background denotes non-fire colors.
To determine the color of the flame in the HSV color space, the results of the three channels are combined.
Small isolated pixel groups are usually noise and are removed using morphological procedures [14]. In the
final stage of this step, the binarized frame is generated so that the flame color information in the HSV
color space can be combined with that of the YCbCr color space using the logical operator AND.
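The thresholding in (5) can be sketched as below. Two assumptions are made for illustration: the three channel tests are combined with a logical AND, and the RGB to HSV conversion uses Python's standard `colorsys.rgb_to_hsv`, whose hue formula may differ slightly from the arccos form in (2). The function names are hypothetical.

```python
import colorsys


def is_fire_color_hsv(r, g, b):
    """True if an RGB pixel (components in [0, 1]) falls inside thresholds (5)."""
    h, s, v = colorsys.rgb_to_hsv(r, g, b)  # all three results are in [0, 1]
    return 0 < h < 0.2 and 0.47 < s < 0.98 and 0.7 < v < 0.98


def hsv_fire_mask(frame):
    """Binary mask (1 = fire color) for a frame given as rows of (r, g, b) tuples."""
    return [[1 if is_fire_color_hsv(*pixel) else 0 for pixel in row] for row in frame]


frame = [
    [(0.9, 0.5, 0.1), (0.1, 0.2, 0.9)],  # orange, blue
    [(0.5, 0.5, 0.5), (0.3, 0.2, 0.1)],  # gray, dark brown
]
mask = hsv_fire_mask(frame)
```

Only the orange pixel passes all three tests; the blue pixel fails on hue, the gray pixel on saturation, and the dark brown pixel on value.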


2.4.2. YCbCr color space 

The YCbCr color scheme is widely used in digital video to represent color as luminance and two
color-difference signals. The luminance component is denoted by Y, while the chrominance-blue and
chrominance-red components are represented by Cb and Cr. Because the YCbCr color space separates
chrominance from brightness better than other color spaces, it is a preferred choice for distinguishing
fire pixels [20].

The RGB color space can distinguish between a variety of colors, but it is sensitive to changes in 
lighting. This means that the fire detection color rules will not work properly if the lighting in the frame 
changes. To tackle this problem, the RGB frame is transformed into a color space that separates intensity
from chrominance more effectively. The YCbCr color space is obtained from RGB by applying the following
conversion [21].


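$$\begin{bmatrix} Y \\ Cb \\ Cr \end{bmatrix} = \begin{bmatrix} 0.2568 & 0.5041 & 0.0979 \\ -0.1482 & -0.2910 & 0.4392 \\ 0.4392 & -0.3678 & -0.0714 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix} + \begin{bmatrix} 16 \\ 128 \\ 128 \end{bmatrix} \quad (6)$$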
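A direct transcription of conversion (6) is sketched below, assuming 8-bit R, G, and B values in [0, 255]. The function name `rgb_to_ycbcr` is hypothetical, and no rounding or clipping is applied.

```python
def rgb_to_ycbcr(r, g, b):
    """Apply conversion (6) to one pixel; r, g, b are assumed to be in [0, 255]."""
    y = 0.2568 * r + 0.5041 * g + 0.0979 * b + 16
    cb = -0.1482 * r - 0.2910 * g + 0.4392 * b + 128
    cr = 0.4392 * r - 0.3678 * g - 0.0714 * b + 128
    return y, cb, cr


black = rgb_to_ycbcr(0, 0, 0)
orange = rgb_to_ycbcr(255, 128, 0)  # a flame-like color
```

For the orange pixel, Y and Cr are both well above Cb, which is the pattern that the color rules in this section rely on.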


The YCbCr color space decomposes a frame into three components: luminance (represented by Y) and 
chrominance-blue and chrominance-red components (represented by Cb and Cr, respectively). The mean 
values of these components can be calculated for a specific frame. 


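$$Y_{mean} = \frac{1}{K}\sum_{i=1}^{K} Y(x_i, y_i), \quad Cb_{mean} = \frac{1}{K}\sum_{i=1}^{K} Cb(x_i, y_i), \quad \text{and} \quad Cr_{mean} = \frac{1}{K}\sum_{i=1}^{K} Cr(x_i, y_i) \quad (7)$$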


The spatial position of a pixel is denoted by $(x_i, y_i)$, while the mean luminance and chrominance values are
represented by $Y_{mean}$, $Cb_{mean}$, and $Cr_{mean}$. $K$ signifies the number of pixels in a frame. Notably, in
frames depicting fire, the luminance Y of the flame exceeds its chrominance-blue Cb, and its chrominance-red Cr is
also higher than Cb. This is evident from the frames, as exemplified in Figure 2(a), Figure 2(b), Figure 2(c), and
Figure 2(d). Thus, rule one can be formulated as:


$$\text{Rule 1: } F(x, y) = \begin{cases} 1, & \text{if } Y(x, y) > Cb(x, y) \cap Cr(x, y) > Cb(x, y) \\ 0, & \text{otherwise} \end{cases} \quad (8)$$

In addition to (8), as the flame zone is frequently the brightest area in the observed picture, it is also useful to 
know the mean values of the three components, Y-mean, Cb-mean, and Cr-mean. The value of the Y 
component in the fire zone is greater than the mean Y component for the entire frame, but the value of the Cb 
component is often lower than the mean Cb value for the entire frame. Moreover, the flame region’s Cr 
component exceeds the mean Cr component [6], which may be summarized as the following rule: 


$$\text{Rule 2: } F(x, y) = \begin{cases} 1, & \text{if } Y(x, y) > Y_{mean} \cap Cb(x, y) < Cb_{mean} \cap Cr(x, y) > Cr_{mean} \\ 0, & \text{otherwise} \end{cases} \quad (9)$$

As a result, the flame zone in the YCbCr color space is obtained by combining the two rules. The HSV and
YCbCr color space rules are then combined using the binary AND operator to create the two-color model,
which is applied to a frame to find the fire regions of interest, defined as $R_{color}(i, j, n)$.
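The two YCbCr rules can be sketched as follows, assuming that they are combined with a logical AND and that the Y, Cb, and Cr planes are already available as equally sized lists of rows. The function names are hypothetical.

```python
def plane_mean(plane):
    """Mean of a 2-D plane given as a list of rows, as in (7)."""
    count = sum(len(row) for row in plane)
    return sum(sum(row) for row in plane) / count


def ycbcr_fire_mask(y, cb, cr):
    """Binary mask of pixels that satisfy both Rule 1 and Rule 2 (assumed AND)."""
    y_mean, cb_mean, cr_mean = plane_mean(y), plane_mean(cb), plane_mean(cr)
    mask = []
    for row_y, row_cb, row_cr in zip(y, cb, cr):
        mask_row = []
        for py, pcb, pcr in zip(row_y, row_cb, row_cr):
            rule1 = py > pcb and pcr > pcb
            rule2 = py > y_mean and pcb < cb_mean and pcr > cr_mean
            mask_row.append(1 if rule1 and rule2 else 0)
        mask.append(mask_row)
    return mask


# Tiny 2 x 2 example: only the top-left pixel is bright, low in Cb and high in Cr.
y_plane = [[200, 50], [60, 40]]
cb_plane = [[90, 130], [128, 128]]
cr_plane = [[180, 120], [128, 128]]
mask = ycbcr_fire_mask(y_plane, cb_plane, cr_plane)
```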


Figure 2. The Y, Cb, and Cr channels of the RGB input frame: (a) the initial RGB frame, (b) the Y channel, 
(c) the Cb channel, and (d) the Cr channel 


2.5. Combining Otsu’s algorithm, frame differences, and two-color detection

Using Otsu’s threshold, frame differences, or two-color detection alone to identify fire would lead to
many false alarms because of the complex nature of the fire attributes mentioned earlier. Therefore, the
outputs of all three approaches are integrated, as shown in Figure 3(a), Figure 3(b), Figure 3(c), and
Figure 3(d), to fully exploit their properties and accurately identify the fire area $R_{fire}(i, j, n)$ using (10).


$R_{fire}(i, j, n) = \text{binary image}(i, j, n) \cap R_{color}(i, j, n) \cap I_{d(k,k+8)}(i, j, n)$ (10)


This combined approach is illustrated in Figure 3(e), where the flame region is determined and bounded by a
green box. The bounded region is extracted from the original RGB frame to obtain its area, and if the area
is above a certain threshold, the region is considered a fire. The minimum area for fire detection is set
to 55.
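The combination in (10) and the area test can be sketched as below. The sketch assumes that the area is that of the bounding box around all surviving pixels, measured in pixels, and that a region counts as fire when this area is at least 55; the text does not state the unit or whether the comparison is strict. All function names are hypothetical.

```python
def combine_masks(otsu_mask, color_mask, motion_mask):
    """Pixel-wise AND of the three binary masks, as in (10)."""
    return [
        [a & b & c for a, b, c in zip(row_a, row_b, row_c)]
        for row_a, row_b, row_c in zip(otsu_mask, color_mask, motion_mask)
    ]


def bounding_box(mask):
    """Return (top, left, bottom, right) of the white pixels, or None if empty."""
    rows = [i for i, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [j for row in mask for j, value in enumerate(row) if value]
    return rows[0], min(cols), rows[-1], max(cols)


def is_fire(mask, min_area=55):
    """Assumed decision rule: bounding-box area (in pixels) of at least min_area."""
    box = bounding_box(mask)
    if box is None:
        return False
    top, left, bottom, right = box
    return (bottom - top + 1) * (right - left + 1) >= min_area


ones = [[1] * 10 for _ in range(10)]
# Color mask with an 8 x 8 block of candidate fire pixels.
color = [[1 if 1 <= i <= 8 and 1 <= j <= 8 else 0 for j in range(10)] for i in range(10)]
fire_region = combine_masks(ones, color, ones)
detected = is_fire(fire_region)
```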




Figure 3. Results of combination: (a) the original frame; (b) result of the automatic threshold; (c) motion
detection result; (d) two-color detection result; and (e) the outcome of combining (b), (c), and (d)


3. RESULTS AND DISCUSSION

The proposed method is implemented in MATLAB R2021b on a PC with an Intel Core i7 2.70 GHz CPU,
16 GB of RAM, and the Windows 10 operating system. The test video database combines real-time
recordings and offline videos [14] and covers an assortment of diverse circumstances, including a variety of
backdrops and environmental conditions.

Figure 4 shows real-time outdoor flame detection for three different fire video scenes (F62, F61, and
F56). The color of the sun is similar to that of the flame, yet the system recognizes only the flame.
Table 1 summarizes the real-time experiment findings, where Nn stands for the total number of video
frames, all of which contain fire, Nd stands for the number of frames successfully identified by the
suggested technique, and Rd stands for the detection rate for a video.


$R_d = N_d / N_n$ (11)
The average detection rate for real-time video exceeds 93%. The most important factor is the time
required to identify a fire. The suggested system can detect a fire in less than 0.26 seconds, which allows
detection to occur in real time.

Figure 5(a) to Figure 5(e) show the results of testing on recorded video in six distinct scenarios. Testing
was not limited to the database videos, so as to cover as many anticipated forest fire scenarios as possible
and to assess the efficiency of the proposed method, as shown in Figure 5(a), Figure 5(c), and Figure 5(d).
It is also worth noting that the algorithm can disregard fire-colored background regions such as those
depicted in Figure 5(e). We compare the proposed approach to previous fire detection systems based on
color and other characteristics. Chen et al. [22] used the RGB and HSI color spaces, Celik et al. [23] used
the RGB color space, and Marbach et al. [24] used the YUV color space, while Shidik’s method [14] combines
RGB, YCbCr, and HSV as multicolor features with background subtraction. On the same fire database, our
system achieved an average detection rate of 99%, as shown by the experimental findings in Table 2.

In terms of overall detection rate, our proposed system beats the earlier techniques. However, because the
background of the video “barbeq.avi” is simple and constant, Shidik’s method outperforms ours on that
video. The Chen, Celik, and Marbach methods achieve a high detection rate on “Controlled1.avi”. Fire is
easier to detect in videos such as Controlled2, Forestfire, and Forest1 to Forest4 because nothing, such as
moving objects, distracts from the flame and the flame features are clearly identifiable. As a consequence,
practically all of these techniques achieve the same detection rates on these videos.

Table 3 lists the number of false positive frames generated by the various methods. Nr denotes the
number of frames that do not contain fire but trigger a fire alarm. Mov1, Mov2, Mov3, and Mov4 show a
passing fire-colored vehicle, three people entering a room, road transport, and a dancing person wearing
fire-colored clothing, respectively [25]. Table 3 demonstrates that our approach achieved a lower average
number of false positive frames than the other methods. Moreover, except for Mov4, our approach matched
or outperformed the other methods in every video. To reduce the number of false positives further, future
research should include additional characteristics. Overall, our method performs better than the
alternatives in terms of both detection speed and effectiveness.


Figure 4. Real-time flame detection result 


Table 1. Outcomes of the suggested approach (real-time)
Video   Nn    Nd    Rd
F62     49    39    0.796
F61     97    96    0.989
F56     15    15    1.000
Total   161   150   0.931


Table 2. Outcomes of the suggested approach (offline)

Database           The proposed  Chen [22]     Celik [23]    Marbach [24]  Shidik [14]
Video        Nn    Nd    Rd      Nd    Rd      Nd    Rd      Nd    Rd      Nd    Rd
Barbeq       439   430   0.979   412   0.959   415   0.945   400   0.911   439   1.000
Controlled1  260   250   0.961   259   0.996   259   0.996   259   0.996   105   0.404
Controlled2  246   246   1.000   246   1.000   246   1.000   246   1.000   246   1.000
Controlled3  208   208   1.000   207   0.995   207   0.995   207   0.995   208   1.000
Forest1      200   200   1.000   200   1.000   200   1.000   200   1.000   200   1.000
Forest2      245   245   1.000   245   1.000   245   1.000   245   1.000   245   1.000
Forest3      255   254   0.996   254   0.996   254   0.996   254   0.996   254   0.996
Forest4      219   218   0.995   218   0.995   218   0.995   218   0.995   218   0.995
Forestfire   218   218   1.000   218   1.000   218   1.000   218   1.000   218   1.000
Total        2290  2269  0.990   2259  0.986   2262  0.987   2247  0.981   2133  0.931
Figure 5. The flame detection results for recorded video in many scenarios: (a) flame under the sun, (b) flame
in the forest, (c) two flame detections, (d) small flame detections, and (e) flame with heavy smoke


Table 3. False positive frames in video

Video    Nr   Ours   Chen [22]   Celik [23]   Marbach [24]   Shidik [14]
Mov1     0    7      10          23           21             13
Mov2     0    4      8           10           12             6
Mov3     0    0      0           0            0              0
Mov4     0    27     26          34           39             30
Average  0    9.5    11          16.75        18             12.25


4. CONCLUSION 

This paper presented an automatic method for detecting fire in a video stream. The proposed
approach involves five stages. Firstly, the input video is pre-processed using the integer Haar lifting
wavelet transform to decompose it and reduce the data size while preserving information. This reduces the
flame detection time to less than 0.26 seconds. Secondly, an automated threshold selection technique based
on Otsu’s method is used to identify flame intensity pixels. Thirdly, frame differences are used to detect
fire motion. Fourthly, the YCbCr and HSV color space models are employed to identify likely flame regions.
Finally, the results are combined to determine the fire zones, and the fire area is calculated using a
simple approach. The approach is currently being tested using multiple video feeds. According to the
experimental results, it achieves a detection rate of 99% for offline videos and more than 93% for
real-time video. The system is simple, fast, and efficient.